Add 16a4w_block QAT config #15878
Conversation
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15878. Note: links to docs will display an error until the docs builds have completed. ❗ 1 currently active SEV; if your PR is affected, view it on the HUD. ✅ No failures as of commit 6dfc4ac with merge base d13c789. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@winskuo-quic can you review and approve this diff?
I think for QAT testing we can use pseudo labels generated by the FP32 model: run a few mini training steps on the fake-quant model, then compare its outputs against the FP32 baseline (the pseudo labels) within acceptable atol/rtol thresholds as usual. A sketch of that flow is below.
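A minimal sketch of that test flow, assuming a single-tensor model input; `fp32_model`, `fake_quant_model`, and the step count/tolerances are placeholders, not the actual test in this PR:

```python
import torch

def pseudo_label_qat_check(fp32_model, fake_quant_model, example_input,
                           num_steps=5, atol=1e-2, rtol=1e-2):
    # FP32 outputs act as pseudo labels for the fake-quant model.
    fp32_model.eval()
    with torch.no_grad():
        pseudo_labels = fp32_model(example_input)

    # A few mini training steps so the fake-quant parameters settle.
    optimizer = torch.optim.SGD(fake_quant_model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()
    fake_quant_model.train()
    for _ in range(num_steps):
        optimizer.zero_grad()
        loss = loss_fn(fake_quant_model(example_input), pseudo_labels)
        loss.backward()
        optimizer.step()

    # Compare against the FP32 baseline within the usual tolerances.
    fake_quant_model.eval()
    with torch.no_grad():
        out = fake_quant_model(example_input)
    torch.testing.assert_close(out, pseudo_labels, atol=atol, rtol=rtol)
```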
force-pushed the branch from eb2e9f9 to 0a8fd5c
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`; `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale. Reviewed By: viveknayakatmeta. Differential Revision: D87194388
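A minimal, hedged sketch of the shape of that change, assuming a `FakeQuantize`-style subclass and an observer-owned `convert` hook; the actual executorch/QNN class names and hook signatures may differ:

```python
# Sketch only: `convert(model, node)` and the helper below are assumptions
# about the shape of the change, not executorch's real API.
import torch
from torch.ao.quantization.fake_quantize import FakeQuantize

class LPBQFakeQuantize(FakeQuantize):
    """QAT fake-quantizer for block-quantized (LPBQ) weights.

    Trains like a normal FakeQuantize, but graph conversion is delegated
    to the underlying LPBQ observer, which owns the per-block scale logic.
    """

    def convert(self, model, node):
        # Fall back to the LPBQ observer's convert so the block-wise
        # scales it computed end up in the exported graph.
        return self.activation_post_process.convert(model, node)

def derive_bias_scale(act_scale: torch.Tensor, weight_obs_or_fq) -> torch.Tensor:
    """Bias scale = activation scale * weight scale. The bias-derivation
    path must recognize the new subclass, not just the bare observer."""
    if isinstance(weight_obs_or_fq, LPBQFakeQuantize):
        # Unwrap to the observer that holds the block-wise weight scales.
        weight_obs_or_fq = weight_obs_or_fq.activation_post_process
    w_scale, _ = weight_obs_or_fq.calculate_qparams()
    return act_scale * w_scale
```

This mirrors why `_derived_bias_quant_spec` needs to check for the subclass: without the isinstance branch, the per-block weight scales would never reach the bias during QAT.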
force-pushed the branch from 0a8fd5c to d26dc52
I added a test under …, but it fails. It seems that this is caused by the NumPy array in the config.

Another strange thing is that this error doesn't happen when we export the real model we use (however, that export script is internal and I can't share it, unfortunately).
If I manually convert this line to be a Python list, it seems to work; perhaps some pybind dependencies weren't being included when running the test? A sketch of that workaround is below.
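For illustration only; the field name `block_sizes` and its value here are hypothetical, not the actual line linked above:

```python
import numpy as np

block_sizes = np.array([1, 64])     # value as it comes out of the config
block_sizes = block_sizes.tolist()  # manual workaround: plain Python list
assert block_sizes == [1, 64]
```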
force-pushed the branch from d26dc52 to 6dfc4ac
Please ignore my question above about the NumPy array; our internal test runner was missing a pybind dependency.
Summary: Introduce a FakeQuantizer subclass. It falls back to the LPBQ observer's `convert`; `_derived_bias_quant_spec` also looks for it to correctly derive the bias scale.

Open to suggestions on how to test. Naveen launched a QAT run and it seems to produce reasonable results.

Differential Revision: D87194388